Draft Genome Sequence of Clostridium pasteurianum NRRL B-598, a
Potential Butanol or Hydrogen Producer
We present a draft genome sequence of Clostridium pasteurianum NRRL B-598. This strain ferments saccharides by two-stage
acetone-butanol (AB) fermentation, is oxygen tolerant, and has high hydrogen yields.
Clostridium pasteurianum NRRL B-598 is a spore-forming,
anaerobic, mesophilic, heterofermentative, rod-shaped bacterium
(young cells are motile). It differs from the recently sequenced
C. pasteurianum DSM 525 (1), especially in its inability to utilize
glycerol as a substrate, its negligible formation of ethanol, and its
production of acetone instead of 1,3-propanediol. This strain has
been used in only a few studies (2-7); however, it might be a useful
platform for further genetic modification because it is not sensitive
to oxygen, has versatile sugar-fermenting and proteolytic abilities,
seems to be genetically stable in comparison with other clostridia,
and tolerates minor changes in fermentation conditions.

DNA isolation yielded only chromosomal DNA; no plasmids were
detected. For C. pasteurianum NRRL B-598, a single-end library was
sequenced with the GS Junior System (Roche). Two sequencing runs
were performed. The sequence reads from both runs were assembled
with GS De Novo Assembler 2.9 (Roche), which provided the most
acceptable assembly. Annotation was added by the NCBI Prokaryotic
Genome Annotation Pipeline (PGAP)
(http://www.ncbi.nlm.nih.gov/genome/annotation_prok/). ProSplign
(http://www.ncbi.nlm.nih.gov/sutils/static/prosplign/prosplign.html)
and GeneMarkS+ (8) were used for open reading frame (ORF) detection;
tRNAscan-SE (9) was used for tRNA prediction, and rRNAs were
predicted by a sequence similarity search using BLAST against an
RNA sequence database and/or using Infernal and Rfam models. The
G+C content was calculated using the draft genome sequence. The
resulting draft genome sequence of C. pasteurianum NRRL B-598
comprises 6,041,878 bases that are split into 138 contigs. The G+C
content is 29.6%. In total, 5,547 genes were predicted by PGAP,
including 5,367 protein-coding sequences (CDSs). The genome of
C. pasteurianum NRRL B-598 is larger than that of the type strain
Clostridium pasteurianum DSM 525 (4.29 Mb) (1) as well as those of
other solvent producers, e.g., Clostridium acetobutylicum ATCC 824
(4.13 Mb) (10) and Clostridium acetobutylicum DSM 1731 (11), but
smaller than that of
Clostridium saccharoperbutylacetonicum N1-4 (6.67 Mb) (12). In
total, 29 rRNA and 76 tRNA genes were identified in the genome
sequence.
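
The summary figures reported above (number of contigs, total bases,
and G+C content) can be recomputed from any multi-FASTA assembly file.
The following is a minimal sketch, not the procedure used by the
authors: the function names read_fasta and assembly_stats are
hypothetical, the toy contigs are invented for illustration, and the
choice to compute G+C over unambiguous A/C/G/T bases only is an
assumption, since the text does not state how ambiguous bases were
handled.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def assembly_stats(handle):
    """Return contig count, total length, and G+C percentage.

    Assumption: G+C is computed over unambiguous bases (A, C, G, T) only,
    so N and other ambiguity codes count toward length but not toward G+C.
    """
    n_contigs = total = gc = acgt = 0
    for _, seq in read_fasta(handle):
        n_contigs += 1
        total += len(seq)
        gc += seq.count("G") + seq.count("C")
        acgt += sum(seq.count(base) for base in "ACGT")
    gc_percent = 100.0 * gc / acgt if acgt else 0.0
    return {"contigs": n_contigs, "total_bases": total, "gc_percent": gc_percent}


# Toy input: two hypothetical contigs, not data from this genome.
toy = StringIO(">contig1\nATGCATAT\n>contig2\nGGCCNNAT\n")
stats = assembly_stats(toy)
print(stats)
```

For a real assembly, the StringIO object would be replaced by an open
file handle, e.g. open("assembly.fasta"), where the file name is again
only a placeholder.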

The genome will be subjected to thorough gene mining in the
near future; however, some interesting genes have already been
identified, e.g., the spo0A gene, which encodes the sporulation
initiator protein, and catalase and superoxide dismutase genes,
which may be associated with oxygen tolerance. Genes involved in
solvent production (ald, ctfA, ctfB, and adc) have also been
identified. These genes are probably clustered in operons, and all
of them are highly similar to the equivalent genes found in the
genome of Clostridium beijerinckii NCIMB 8052.

Nucleotide sequence accession numbers. Data from this
whole-genome shotgun project have been deposited at
DDBJ/EMBL/GenBank under the accession no. AYXR00000000. Version
AYXR01000000 is described in this paper.